Transmissible gastroenteritis virus (TGEV) is an enteropathogenic coronavirus isolated for the first time in 1946. Nonenteropathogenic porcine respiratory coronaviruses (PRCVs) have been derived from TGEV. The genetic relationship among six European PRCVs and five coronaviruses of the TGEV antigenic cluster has been determined based on their RNA sequences. The S proteins of the six PRCVs have an identical deletion of 224 amino acids starting at position 21. The deleted area includes the antigenic sites C and B of the TGEV S glycoprotein. Interestingly, two viruses (NEB72 and TOY56) with respiratory tropism have S proteins with a size similar to that of the enteric viruses. The S proteins of NEB72 and TOY56 have 2 and 15 specific amino acid differences, respectively, from those of the enteric viruses. Four of the residues changed (aa 219 of the NEB72 isolate and aa 92, 94, and 218 of TOY56) are located within the deletion present in the PRCVs and may be involved in the receptor binding site (RBS) conferring enteric tropism to TGEVs. A second RBS used by the virus to infect ST cells might be located in a conserved area between sites A and D of the S glycoprotein, since monoclonal antibodies specific for these sites inhibit the binding of the virus to ST cells. An evolutionary tree relating 13 enteric and respiratory isolates has been proposed. According to this tree, a main virus lineage evolved from a recent progenitor virus which was circulating around 1941. Secondary lineages from it gave rise to PUR46, NEB72, TOY56, MIL65, BRI70, and the PRCVs, in this order. Least squares estimation of the origin of TGEV-related coronaviruses showed a significant constancy in the fixation of mutations with time, that is, the existence of a well-defined molecular clock. A mutation fixation rate of 7 ± 2 × 10⁻⁴ nucleotide substitutions per site and per year was calculated for TGEV-related viruses. This rate falls in the range reported for other RNA viruses. Point mutations and probably recombination events have occurred during TGEV evolution. © 1992 Academic Press, Inc.


INTRODUCTION 

Transmissible gastroenteritis virus (TGEV) belongs to one of the two major antigenic groups of mammalian coronaviruses (Siddell et al., 1982; Spaan et al., 1988). The virus was first isolated in 1946 (Cox et al., 1990a; Doyle and Hutchings, 1946). It is an enteropathogenic coronavirus which replicates both in villus epithelial cells of the small intestine and in lung cells. In 1984, a nonenteropathogenic virus related to TGEV, the porcine respiratory coronavirus (PRCV), appeared in Europe (Pensaert et al., 1986; Callebaut et al., 1988). This virus replicates to high titers in the respiratory tract and undergoes only limited replication in unidentified submucosal cell types of the small intestine (Cox et al., 1990a,b). A virus similar to the European PRCV has been recently described in North America (Wesley et al., 1990b). In contrast to TGEV, PRCV infection produced no clinical signs of disease (Pensaert et al., 1986; Duret et al., 1988; Wesley et al., 1990b).

The antigenic cross-reaction among isolates of TGEV and PRCV has been clearly documented (Callebaut et al., 1988; Garwes et al., 1988; Sánchez et al., 1990; Rasschaert et al., 1990; Wesley et al., 1990b). Both types of viruses have common antigenic determinants in the three structural proteins: spike (S), membrane (M), and nucleoprotein (N). The absence of two antigenic sites in the S protein of the PRCV isolates has been the basis for their differentiation from the enteric viruses (Sánchez et al., 1990). Sequencing of the S gene of a French PRCV isolate (Rasschaert et al., 1990) and of a 200-nucleotide (nt) fragment of the S gene of a North American PRCV isolate (Wesley et al., 1990a) has revealed that both S proteins contain, at comparable locations within the protein, a single deletion of 224 and 227 amino acids, respectively. These isolates also showed deletions, different in each virus, in the genes coding for the nonstructural proteins, which map downstream of the 3'-end of the S gene (Britton, 1990; Rasschaert et al., 1990; Wesley et al., 1991). PRCV was transmitted by aerosols and has now been detected in most European countries (Enjuanes and Van der Zeijst, 1992). It has been proposed (Enjuanes and Van der Zeijst, 1992) that PRCV behaves as a natural vaccine against TGE, which makes the study of its origin and evolution interesting. The analysis of the genetic relationship among these respiratory isolates and others with respiratory tropism would allow us to determine the molecular basis of their tropism and evolution.

In this manuscript we describe the genetic homology among eight respiratory and three enteric isolates of the TGEV antigenic cluster, which identified amino acids potentially involved in receptor binding sites and conserved areas of the S gene. Based on these viral sequences, an evolutionary tree and mechanisms for TGEV evolution have been proposed.


MATERIALS AND METHODS 
Cells and viruses 

All viruses were grown in swine testicle (ST) cells (McClurkin and Norman, 1966). The characteristics of the viruses are described in Table 1. For simplicity, the viruses are named in the text with three letters indicating their geographical origin or classical name, followed by two numbers indicating the earliest year of isolation as reported in the literature. The antigenic characteristics of most of these viruses have been previously reported (Sánchez et al., 1990). Viruses were purified as described (Correa et al., 1990).

Virus proteins 

Protein analysis was performed after dissolution (1 µg/20 µl) in 0.1 M sodium acetate, pH 7, 0.5% sodium dodecyl sulfate (SDS), 1 nM phenylmethylsulfonyl fluoride (PMSF), 0.1 µM N-α-p-tosyl-L-lysine chloromethyl ketone (TPCK), and 1 µg/ml pepstatin. When indicated, proteins were deglycosylated by incubation overnight at 37° with protein N-glycosidase F (0.04 U/ml, Boehringer-Mannheim), and the reaction was stopped by freezing. Proteins were subjected to SDS-7.5% polyacrylamide gel electrophoresis (PAGE) after the samples were reduced with 5% 2-mercaptoethanol (Laemmli, 1970). Finally, the proteins were detected by silver staining (Ansorge, 1985).

RNA sequencing 

RNA was extracted from purified virions as described by Gebauer et al. (1991). RNA was sequenced by oligodeoxynucleotide primer extension and the dideoxynucleotide chain termination procedure (Sanger et al., 1977) using the protocol described by Fichot and Girard (1990). For RNA sequencing, primers complementary to the S gene (Gebauer et al., 1991) were used. Sequence data were assembled using the computer programs of the Genetics Computer Group (University of Wisconsin).


Evolutionary tree 

Sequence information has been analyzed following standard phylogenetic methods. The distance between each pair of nucleotide sequences was estimated using the formula $d = -\frac{3}{4}\ln\left(1 - \frac{4p}{3}\right) L$ (Jukes and Cantor, 1969), where p is the proportion of changed nucleotides displayed by the compared sequences, and L is the length of the sequences after alignment. The two gaps introduced to align the sequences were excluded from the calculations. The neighbor-joining method (Saitou and Nei, 1987; Sourdis and Nei, 1988), as implemented in the program TREEDIST (available from J.D. upon request), was used to obtain a phylogenetic tree from the pairwise distance matrix. A parallel phylogenetic analysis was carried out using the least squares method (Fitch and Margoliash, 1967), utilizing the program FITCH from the PHYLIP package, version 3.3 (Felsenstein, 1990). The reliability of the tree, i.e., the confidence levels for branching order, was determined by the bootstrap method (Efron, 1982; Felsenstein, 1985). A high number of bootstrap replicates of the original set of sequences was obtained. For each replicate a phylogenetic tree was obtained as described above. A consensus topology for the tree, as well as confidence intervals for each branching point (Felsenstein, 1985), were then obtained by applying the program CONSENSE, also from the PHYLIP package. Automated derivation of bootstrap replicates, distance matrices, and neighbor-joining tree estimations was provided by the TREEDIST program.
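
The distance and resampling steps described above can be illustrated with a minimal Python sketch. It is not the TREEDIST implementation, which is not described in enough detail here to reproduce; the function names, the use of "." as the gap character, and the toy sequences are assumptions made only for illustration.

```python
import math
import random


def jukes_cantor_distance(seq_a, seq_b, gap="."):
    """Expected number of substitutions between two aligned sequences.

    Implements d = -(3/4) * ln(1 - 4p/3) * L, where p is the proportion of
    differing sites and L is the number of compared sites. Columns in which
    either sequence carries a gap are excluded from the calculation.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    length = len(pairs)
    if length == 0:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / length
    if p >= 0.75:
        raise ValueError("p >= 3/4: the Jukes-Cantor distance is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0) * length


def bootstrap_alignment(alignment, rng):
    """Build one bootstrap replicate by resampling columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


# Hypothetical toy alignment (not real viral sequence data).
toy = {
    "virus_A": "ACGTACGTAC",
    "virus_B": "ACGTACGTAT",
    "virus_C": "ACG..CGTAT",
}
replicate = bootstrap_alignment(toy, random.Random(0))
print(jukes_cantor_distance(toy["virus_A"], toy["virus_B"]))
```

Each replicate would then be passed through the same distance and tree-building steps, and the resulting trees summarized in a consensus.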

The origin of the phylogenetic tree was estimated by a linear least squares procedure (Sokal and Rohlf, 1981). We assumed a constant average rate of fixation of mutations. This procedure determines the origin by finding the point in the tree that minimizes the residual sum of squares of a linear least squares fit relating the distances between each isolate and this point to the isolation dates. The slope of the line provides an estimate of the rate of fixation of mutations. The intercept of the line with the horizontal axis (time) gives an estimate of the origin of the TGEV antigenic cluster of viruses. Errors and confidence intervals were calculated for the slope and the intercept with the time axis (Sokal and Rohlf, 1981).
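
The regression step of this procedure can be sketched as follows. This is a simplified illustration for a fixed root: it fits distance against isolation year and reads off the slope and the intercept with the time axis, but it does not perform the search over candidate root positions in the tree. The function name and the data are hypothetical.

```python
def fit_molecular_clock(years, distances):
    """Ordinary least squares fit of distance = slope * year + intercept.

    Returns (slope, intercept, origin_year, r_squared), where origin_year is
    the year at which the fitted line crosses the time axis (distance = 0).
    """
    n = len(years)
    if n != len(distances) or n < 2:
        raise ValueError("need at least two (year, distance) pairs")
    mean_t = sum(years) / n
    mean_d = sum(distances) / n
    s_tt = sum((t - mean_t) ** 2 for t in years)
    s_dd = sum((d - mean_d) ** 2 for d in distances)
    s_td = sum((t - mean_t) * (d - mean_d) for t, d in zip(years, distances))
    if s_tt == 0 or s_dd == 0 or s_td == 0:
        raise ValueError("years and distances must vary and be correlated")
    slope = s_td / s_tt                      # substitutions per year
    intercept = mean_d - slope * mean_t
    origin_year = -intercept / slope         # intercept with the time axis
    r_squared = (s_td * s_td) / (s_tt * s_dd)
    return slope, intercept, origin_year, r_squared


# Hypothetical, perfectly clock-like data (not the values behind Fig. 5).
years = [1950, 1960, 1970, 1980]
distances = [10.0, 20.0, 30.0, 40.0]
slope, intercept, origin_year, r_squared = fit_molecular_clock(years, distances)
print(slope, origin_year, r_squared)
```

In the full procedure this fit would be repeated for each candidate root position, keeping the position with the smallest residual sum of squares.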

RESULTS 

Structural proteins of enteric and respiratory 
porcine coronaviruses 

Both enteric and respiratory TGEVs have been studied. The respiratory viruses could be grouped in two clusters, one lacking antigenic sites B and C (the PRCVs) and another with these antigenic sites (NEB72 and TOY56) (Sánchez et al., 1990). The molecular weights of the structural proteins of the TGEVs and PRCVs listed in Table 1 were determined, with the exception of those from the isolates BRI70 and FRA86, which have not been analyzed in this study. These molecular weights were estimated by SDS-PAGE analysis. The mobility of the M and N proteins of all viruses was similar (data not shown). In contrast, the TGEV S glycoproteins and the apoproteins, obtained by deglycosylation with protein N-glycosidase, had higher molecular weights (200 and 158 kDa, respectively) than the S glycoproteins and apoproteins of the PRCVs (170 and 130 kDa, respectively). The results for the standard PUR46 strain, two TGEV strains (NEB72 and TOY56) with respiratory tropism, and three PRCVs (BEL85, BEL87, and HOL87) are shown as representative data (Fig. 1). These results indicate that all the European PRCVs studied have an S protein of similar molecular weight (170 kDa), which is smaller than that of the S glycoprotein of TGEV. In addition, they demonstrate that other isolates with an almost exclusive respiratory tropism (NEB72 and TOY56) do not show the reduction in molecular weight detected in the PRCV isolates.

TABLE 1

CORONAVIRUSES USED IN THIS PUBLICATION

| Designation | Origin (year of isolation) | Dominant tropism | Characteristics | Reference |
|---|---|---|---|---|
| TGEV | | | | |
| PUR46-MAD-CC120 | Purdue University, Indiana (1946) | Enteric & Respiratory | Enteric virus originally isolated by Bohl. Passaged 120-fold on ST cells. Reference clone used in our laboratory. | Bohl et al., 1972; Sánchez et al., 1990; Gebauer et al., 1991 |
| PUR46-PAR-CC120 | idem | idem | Same origin as PUR46-MAD-CC120. Clone used by H. Laude's group. | Bohl et al., 1972; Rasschaert and Laude, 1987 |
| PUR46-UTR-CC120 | idem | idem | Same origin as PUR46-MAD-CC120. Clone used at Utrecht University. | Bohl et al., 1972; Jacobs et al., 1987 |
| MIL65-AME | Ohio (1965 or before) | idem | Virulent. Passed in vivo. Plaque purified three times on ST cells. | Wesley, 1990 |
| BRI70-FS772 | England (1970) | idem | Maintained by passage in primary cultures of thyroid cells. | Garwes et al., 1978 |
| NEB72-RT | Nebraska (1972) | Respiratory | Isolated from the lungs of a healthy adult pig. Passaged in the lungs of gnotobiotic pigs. Passaged in vitro in lung cells and on ST cells. | Underdahl et al., 1974; This manuscript |
| TOY56-CC168 | Japan (1956) | Respiratory (sporadically isolated in enteric tissues) | Received at passage 163 in swine kidney cells. Passaged 5 times on ST cells. | Furuuchi et al., 1976; Sánchez et al., 1990 |
| PRCV | | | | |
| HOL87-V78-CC5 | The Netherlands (1987) | Respiratory | Originally isolated on ST cells and passaged 5 times on this cell line. | Pensaert et al., 1986; Sánchez et al., 1990 |
| BEL85-83-CC3 | Belgium (1985) | idem | idem | idem |
| BEL87-31-CC5 | Belgium (1987) | idem | idem | idem |
| FRA86-RM4 | France (1986) | idem | idem | Duret et al., 1988; Rasschaert et al., 1990 |
| ENG86-I-CC5 | England (1986) | idem | Isolate PVC-135308, originally grown on primary pig kidney cells and passaged 5 times on ST cells. | Brown and Cartwright, 1986; Garwes et al., 1988; Sánchez et al., 1990 |
| ENG86-II-CC5 | England (1986) | idem | Isolate PVC-137004, isolated and passaged as PVC-135308. | idem |

Fig. 1. PAGE analysis of the spike protein of TGEV-related coronaviruses before and after deglycosylation. Purified viruses were dissociated (1 µg/20 µl) in 0.1 M sodium acetate, pH 7, with 0.5% SDS and protease inhibitors, and incubated overnight at 37° in the presence (+) or absence (-) of protein N-glycosidase F (0.04 U/µl). The proteins were separated by 7.5% PAGE in the presence of 0.1% SDS and 2-mercaptoethanol and detected using silver staining (Ansorge, 1985). Only the gel area corresponding to the S glycoprotein is shown. Lanes: enteric virus PUR46; respiratory viruses NEB72, TOY56, BEL85, BEL87, and HOL87.


Sequence analysis of the S-glycoprotein of TGEVs 
and PRCVs 

To determine the relationship among the different European PRCV isolates, the complete S gene sequences of the PRCV HOL87, TOY56, and NEB72 respiratory isolates were determined by sequencing the RNA from purified virions (Fig. 2). In addition, the first 1956 nt of the S gene of four other PRCV strains were determined (Fig. 2). The nucleotide or amino acid positions reported in this manuscript refer to the location of equivalent residues in the sequence of MIL65 virus, which has the largest S gene reported for TGEV-related isolates. The 5'-terminal segments sequenced code for the four antigenic sites previously defined, which are located in the globular part of the peplomer (Gebauer et al., 1991). The sequences were aligned with those of the PRCV FRA86 (Rasschaert et al., 1990) and of prototype enteric viruses (Fig. 2). Two deletions were observed, which are summarized diagrammatically in Fig. 3A. One of them removed 224 amino acids, starting at residue 21 of the unprocessed glycoprotein. The second deletion removed 2 amino acids after residue 374. Taking into account the two deletions and the sequence homology among the S genes of these isolates, three sets of viruses could be distinguished: (i) one including the BEL85, FRA86, HOL87, BEL87, ENG86-I, and ENG86-II strains, with a 224-aa deletion which was identical both in the number of residues deleted and in the location of the deletion; (ii) a second set including the PUR46 and NEB72 isolates, with a deletion of 2 aa; and (iii) a third set grouping the MIL65, BRI70, and TOY56 strains, which had no deletion. Although the NEB72 and TOY56 isolates have respiratory tropism, they do not contain the 224-amino-acid deletion. These viruses differ from the enteric viruses by point mutations (Fig. 2). The NEB72 isolate has only two amino acid differences in the S protein, when compared to the PUR46 isolate, that are not shown by other enteric isolates. One of them (aa 219) falls within the deletion present in the PRCVs. The NEB72 isolate is closely related to the PUR46 strain since, in addition, both viruses have the 2-aa deletion (residues 375 and 376) and almost identical sequences in ORFs 3, 3-1, and 4, corresponding to nonstructural proteins (data not shown). The TOY56 isolate has three amino acid changes (residues 92, 94, and 218) within the deletion present in the PRCVs, in relation to the PUR46 strain, which are specific for the TOY56 isolate. The enteric isolates BRI70-FS and MIL65-AME also have a change in aa 218, from valine to threonine, which is different from the change to isoleucine that occurred in the TOY56 isolate.
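
Locating deletions such as those described above amounts to finding runs of gap characters in an aligned sequence. A minimal sketch follows, assuming gaps are written as points (as in the alignment of Fig. 2); the function name and the toy sequence are hypothetical.

```python
def find_deletions(aligned_seq, gap="."):
    """Return (start, length) for each run of gap characters.

    start is the 1-based alignment position of the first deleted residue.
    """
    runs = []
    run_start = None
    for pos, residue in enumerate(aligned_seq, start=1):
        if residue == gap:
            if run_start is None:
                run_start = pos          # a new run of gaps begins here
        elif run_start is not None:
            runs.append((run_start, pos - run_start))
            run_start = None
    if run_start is not None:            # the sequence ends inside a gap run
        runs.append((run_start, len(aligned_seq) - run_start + 1))
    return runs


# Hypothetical toy alignment row (not a real S protein sequence).
toy = "MKKLF" + "...." + "VLLA" + ".." + "GT"
for start, length in find_deletions(toy):
    # A deletion of n amino acids corresponds to 3 * n nucleotides.
    print(start, length, 3 * length)
```

Applied to amino acid alignments, the same conversion relates the 224-aa and 2-aa deletions to deletions of 672 and 6 nt, respectively.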

Fig. 2. Sequence alignment of spike (S) protein genes of TGEVs and PRCVs. The nucleotide sequence of the S gene and the deduced aa sequence of the PUR46-MAD virus are shown in the first two lines. In the other lines, the nucleotide changes in the sequences of other viruses are indicated. Nucleotide changes resulting in amino acid changes are shaded. In the alignment, deleted residues are filled in with points. Sequence numbers indicate the positions that the residues would have in the MIL65 virus. For simplicity, the sequences of two clones of the PUR46 isolate (PUR46-PAR and PUR46-UTR) have been omitted from this series of sequences, since they show minor changes and their sequences were previously published. The sequences of the strains FRA86-RM, MIL65-AME, BRI70-FS, PUR46-PAR, and PUR46-UTR have been previously reported (Britton and Page, 1990; Jacobs et al., 1987; Rasschaert and Laude, 1987; Rasschaert et al., 1990; Wesley, 1990). Sequence indeterminations have been coded as: K for G or T; X for G, A, T, C, or any amino acid; S for C or G; and Y for C or T. Underlined amino acids correspond to the signal peptide. Residues in boxes are involved in the indicated antigenic sites. Asterisks indicate the 3'-end of the segments sequenced. Dashes indicate nonsequenced segments. The nucleotide sequence data reported in this paper have been submitted to the GenBank nucleotide sequence database and have been assigned the accession numbers: PUR46-MAD, M94101; NEB72-RT, M94099; TOY56, M94103; HOL87, M94097; BEL85-83, M94096; BEL87-31, M94098; ENG86-I, M94100; ENG86-II, M94102.

The amino acid homology between the S proteins of PRCVs and TGEVs was studied independently for the globular and the stem portions of the molecule (data not shown). The same overall degree of homology in the S proteins was found in both the globular and stem areas. The amino acid homology was higher than 98% among both the TGEV and the PRCV isolates. In contrast, the overall S protein homology between TGEVs and PRCVs was around 1% lower. Although this percentage difference is small, the fact that these viruses have the amino acids changed in almost identical locations makes this difference significant. In these comparisons, only the S protein segments for which the sequences of the 13 viruses were available have been considered. A large conserved domain was identified in the globular portion of the S protein of TGEVs and PRCVs, between amino acids 405 and 465, when the number of amino acid changes was plotted versus their position in the sequence (Fig. 3B). Furthermore, no amino acid changes were detected in this segment when the sequences of 13 virus isolates were compared (Fig. 3B).
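
The windowed comparison summarized in Fig. 3B can be sketched as a count of amino acid differences from a reference sequence in consecutive 20-residue fragments. How changes shared by several isolates were tallied in the figure is not stated, so this sketch assumes that differences are summed over all compared sequences; the function name and the toy sequences are hypothetical.

```python
def changes_per_window(reference, others, window=20, gap="."):
    """Count amino acid differences from the reference in consecutive windows.

    Returns a list of (window_start, n_changes) with 1-based start positions.
    Positions where either sequence has a gap are not counted as changes.
    Assumption: differences are summed over all sequences in `others`.
    """
    for seq in others:
        if len(seq) != len(reference):
            raise ValueError("all sequences must be aligned to the reference")
    counts = []
    for start in range(0, len(reference), window):
        ref_part = reference[start:start + window]
        n_changes = 0
        for seq in others:
            for ref_aa, aa in zip(ref_part, seq[start:start + window]):
                if ref_aa != gap and aa != gap and ref_aa != aa:
                    n_changes += 1
        counts.append((start + 1, n_changes))
    return counts


# Hypothetical toy sequences, with a window of 5 residues for brevity.
reference = "ACDEFGHIKL"
others = ["ACDEFGHIKL", "ACDQFGHIKV"]
counts = changes_per_window(reference, others, window=5)
conserved = [start for start, n in counts if n == 0]
print(counts, conserved)
```

Windows with a count of zero across all isolates correspond to conserved segments such as the one reported between amino acids 405 and 465.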

Evolutionary tree for the S gene of TGEVs and PRCVs 

The nucleotide sequences of the S glycoprotein genes of eight respiratory viruses and five TGEVs (three of which were different clones of the same PUR46 virus strain) were aligned, taking into account the two deletions of 6 and 672 nt present in the sequences of PUR46 and the PRCVs, respectively, for maximum fit (Fig. 2). Phylogenetic analysis of the sequences (first 1956 nt) of the viruses described in Fig. 2, by either the neighbor-joining or the least squares method of tree reconstruction, gave two identical trees, with the same branching order, confidence levels, and branch lengths (Fig. 4). This congruence in the results, in addition to the high confidence level along the tree, suggests a significant reliability for the evolutionary history described. The least squares relationship between the number of mutations from the origin and the year of isolation was determined (Fig. 5). Extrapolation of this line to zero mutations predicts that these TGEVs originated from a recent common ancestor circulating around 1941. Since then, the PUR46, TOY56, MIL65, BRI70, and PRCV lineages were derived from a main lineage in the indicated order (Fig. 4). Only one isolate (NEB72) accumulated a smaller number of substitutions than expected for its year of isolation. In at least three cases (TOY56, MIL65, and BEL85), it can be stated with a confidence of 99.9% that these were lateral lineages derived from one main lineage (see Discussion). The accumulation of mutations with time (Fig. 5) fits a straight line with a high Pearson correlation coefficient (r² = 0.97). From the slope of this line, the mutation fixation rate can be estimated at 0.95 ± 0.05 substitutions per year.

DISCUSSION 

The structural proteins of seven new strains of the TGEV cluster with enteric and respiratory tropism have been analyzed. Also, the complete sequences of the S genes of three respiratory isolates and the first 1956 nt of the S gene of four other respiratory viruses of the TGEV antigenic cluster have been determined. These sequences, together with published ones of enteric and respiratory TGEV isolates, have been analyzed to determine the genetic homology between TGEVs and PRCVs. Key point mutations which might be responsible for the loss of enteric tropism in certain isolates have been identified. In addition, a large conserved area in the S protein has been identified, and an evolutionary tree relating all these viruses has been proposed.

TGEVs were described for the first time in 1946 (Doyle and Hutchings, 1946). Respiratory variants of the enteric virus were isolated in 1956 (TOY56 strain) (Furuuchi et al., 1976) and in 1972 (strain NEB72) (Underdahl et al., 1974). Highly contagious respiratory isolates which rapidly spread throughout Europe, the PRCVs, were detected for the first time in 1984 (Pensaert et al., 1986). These viruses are serologically related to TGEV and are missing antigenic sites B and C (Callebaut et al., 1988; Sánchez et al., 1990). In the five European isolates sequenced by us, the absence of these sites is due to a deletion of 224 amino acids, starting at residue 21. An identical deletion (both in size and location) was described for another European isolate (Rasschaert et al., 1990). In 1989 a virus (IND89) with the antigenic characteristics of the European PRCVs was isolated in the United States (Wesley et al., 1990b). Sequencing of the first 200 nt of the S gene showed a deletion of 227 amino acids starting at residue 23 (Wesley et al., 1991), i.e., the deletion is shifted two residues downstream in relation to the position of the deletion described for the European PRCVs. These data, together with the high sequence homology (Fig. 2) and the phylogenetic tree obtained (Fig. 4), demonstrate that all six European PRCVs, isolated in four countries (Belgium, France, The Netherlands, and United Kingdom), have a recent common ancestor. In contrast, the North American isolate is probably of independent origin, since (i) if it had been derived from the European PRCVs, the addition of several nucleotides after nt 59 or 60 and a deletion of a few nucleotides at the end of the deletion present in the European PRCVs would have been required (the identity of the nucleotides at the beginning and at the end of the deletion leaves open the precise position of the deletion in both the European and the North American PRCVs); (ii) differences between the genes coding for the nonstructural proteins of the European and North American isolates have also been reported (Rasschaert et al., 1990; Wesley et al., 1991); and (iii) the 200-nt sequence available for the North American isolate placed this PRCV strain closer to the enteric isolates than to the European PRCVs in our evolutionary tree (results not shown).

Fig. 3. Summary of the deletions and amino acid changes present in the S glycoprotein of TGEVs and PRCVs. (A) Location of the deletions. Full and empty bars indicate the sequences known and undetermined, respectively. Letters indicate the approximate location of the antigenic sites. The numbers above these letters indicate amino acid residues involved in the formation of these sites. The position of the deletions is indicated by brackets, and the numbers next to the brackets show the amino acids flanking the residues deleted. (B) Number of amino acid changes in sequential fragments of 20 aa each, in relation to the PUR46-MAD virus sequence. Only the segments for which the sequences of the 13 virus strains were available have been included in the comparison. Amino acid residues have been numbered according to their position in the MIL65 virus after the alignment. The origin of the sequences of the different strains is indicated in the legend of Fig. 2.

The four antigenic sites described in the S glycoprotein of TGEV have been mapped to the NH2-terminal half of the S protein (Delmas et al., 1990; Gebauer et al., 1991). These sites are probably located in the globular part of the S molecule (De Groot et al., 1987; C. Suñé, M. Nermut, J. L. Carrascosa, and L. Enjuanes, unpublished results). In other coronaviruses the S glycoprotein can be split into two subunits, S1 and S2, which probably contain the globular and stem portions of the molecule, respectively (Spaan et al., 1988). The precise residue where the stem part of the S peplomer might start awaits elucidation of its atomic structure. The protein domain that includes antigenic subsites Aa and Ab and site D (Gebauer et al., 1991) showed a slightly higher number of amino acid changes than did other areas of the S protein (Fig. 3). Nevertheless, the overall homology in the globular and stem areas of the S protein is similar at both the nucleotide and amino acid levels (results not shown). This result was not anticipated, given the higher antigenicity of the globular area and the presence there of epitopes relevant to virus neutralization.

The receptor binding site in the S glycoprotein of TGEV that interacts with ST cells probably maps between sites A and D, since TGEV binding to ST cells is best inhibited by MAbs specific for these sites (Suñé et al., 1990). A candidate domain for the localization of this receptor binding site could be the highly conserved area identified between amino acids 405 and 465 (Figs. 2 and 3), although other domains around this area cannot be ruled out. A second RBS may be used to infect enteric cells by TGEVs with enteric tropism. This RBS might be located within the area of the S protein deleted in the PRCVs. More precisely, it could be located around either aa 92, 94, and 218 or aa 219, changed in the TOY56 or in the NEB72 isolates, respectively. Interestingly, both viruses have an amino acid change in contiguous residues (218 and 219), suggesting that these residues may be involved in the RBS. Tissue-specific tropism of coronaviruses is conditioned by the S glycoprotein, and different RBSs in this protein could be recognized by the respiratory or the enteric tissues. Nevertheless, other viral or cellular regulatory mechanisms affecting essential steps of virus replication, other than virus-to-cell binding, could influence viral tropism (Levine, 1984). Genes controlling those regulatory mechanisms could map to areas away from the S gene. Studies based on recombination between TGEVs with enteric and respiratory tropism will help to identify the existence of these genes.

Fig. 4. Evolutionary tree of TGEV-related coronaviruses. Neighbor-joining and least squares methods of tree reconstruction were applied to the first 1956 nt of 13 virus isolates (the 11 isolates indicated in Fig. 2, and the clones PUR46-PAR and PUR46-UTR previously reported) (Table 1). Numbers in the diagram indicate residue substitutions between branching points. Δ indicates the introduction of a deletion between branching points. [symbol] indicates that all the descendants of this fork have, with a probability of 99.9%, a recent common ancestor.

Fig. 5. Relationship between mutation fixation rate and year of isolation. The line relating the number of mutations from the origin to the year of isolation was plotted. Line and origin were estimated at the same time by linear least squares fit. The expression for the line was: d = 0.95 t - 1893, r² = 0.97, where d is the distance to the origin, t is the time in years, and r² is the square of Pearson's correlation coefficient (Sokal and Rohlf, 1981). The data correspond to the viral isolates used in the construction of the evolutionary tree (Fig. 4). The line with the minimum square error was determined and represented. The point showing the poorest fit to the line corresponds to the NEB72 isolate.

Based on nucleotide sequencing data (Fig. 2), an evolutionary tree has been proposed which provides a relationship among 13 PRCV and TGEV isolates (Fig. 4). Since we are dealing with a limited number of isolates from each continent, it is understood that the inclusion of additional sequences from isolates of other areas (e.g., Japan) could show that certain lateral branches are in fact main branches. Only one isolate (NEB72) was out of place. According to the evolutionary tree, it should have been isolated at the same time as the PUR46 strain, since it has accumulated a similar number of point mutations and has the same 6-nt deletion which is present in the PUR46 strains. NEB72 probably represents a virus reintroduction, like those described in other viral systems (Beck and Strohmaier, 1987; Carrillo et al., 1990). Least squares estimation of the origin of TGEV-related coronaviruses demonstrates a significant constancy in the fixation of mutations with time, that is, the existence of a well-defined molecular clock (Kimura, 1983). The mutation fixation rate is 0.95 ± 0.05 substitutions per year. As this rate was measured for 1260 nt, it can be expressed as 7 ± 2 × 10⁻⁴ substitutions per nucleotide and per year. This rate falls in the range reported for other RNA viruses (Domingo and Holland, 1988). The direction defined for the evolutionary process from the predicted origin supports the occurrence of two deletions: one of 6 nt in the lineage from the root to the PUR46 strains and another of 672 nt in the lineage leading from TGEV to the PRCVs. It may be concluded that the European PRCVs have been derived by a 672-nt deletion from an enteric TGEV, since we have examined isolates preceding the PRCVs. In contrast, it cannot be guaranteed that PUR46 emerged by a 6-nt deletion from an unknown ancestor. An alternative explanation could be that the other enteric isolates shown in Fig. 4 were derived from PUR46 by the addition of 6 nt.
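
As a check on the arithmetic in this paragraph, dividing the per-year rate by the number of sites compared gives the central value of the per-site rate, and the deletion sizes in nucleotides are three times their sizes in codons. Only the central value is worked out below; how the quoted uncertainty was propagated is not stated in the text and is not reproduced here.

```latex
\[
\frac{0.95\ \text{substitutions/year}}{1260\ \text{nt}}
\approx 7.5 \times 10^{-4}\ \text{substitutions per nucleotide per year}
\]
\[
224\ \text{aa} \times 3\ \tfrac{\text{nt}}{\text{aa}} = 672\ \text{nt},
\qquad
2\ \text{aa} \times 3\ \tfrac{\text{nt}}{\text{aa}} = 6\ \text{nt}
\]
```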

It is interesting to note that the area deleted in the TGEV S gene to form the PRCVs contains repeated tetrameric (TTCC) or heptameric (AGTTTCC) sequences. These repeated sequences could be involved in inter- or intramolecular recombination by a copy choice mechanism (Lai, 1992), which could have originated the deletion. In coronaviruses and other RNA viruses containing positive-strand RNA genomes, recombinant clones have been isolated with borders at the crossover sites containing some sequence similarity (Banner and Lai, 1991; Raffo and Dawson, 1991; Cascone et al., 1990). Since the putative crossover observed in the generation of PRCVs does not happen at homologous sequences, the deletion might have originated by nonhomologous recombination. This mechanism has been previously implicated in the evolution of coronaviruses, Sindbis virus, and plant viruses (Banner and Lai, 1991; Monroe and Schlesinger, 1983; Bujarski and Dzianott, 1991). If recombination has been the cause of the deletions present in PUR46 and the PRCVs, then two mechanisms of evolution would be involved in the antigenic variation of TGEV: point mutations and recombination.
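
Locating repeated motifs such as TTCC and AGTTTCC in a sequence is a simple scan, sketched below. The function name and the example sequence are hypothetical; the example is not taken from the S gene.

```python
def find_motif_positions(sequence, motif):
    """Return the 1-based start positions of every occurrence of motif.

    Overlapping occurrences are reported, and matching is case-insensitive.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = sequence.find(motif, start + 1)  # step by one to allow overlaps
    return positions


# Hypothetical example sequence containing both motifs named in the text.
example = "GGAGTTTCCAATTCCGG"
print(find_motif_positions(example, "TTCC"))     # [6, 12]
print(find_motif_positions(example, "AGTTTCC"))  # [3]
```

Listing the positions of each repeat relative to the deletion boundaries would show whether the repeats flank the putative crossover sites.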